1 Distributed shared memory in a loosely coupled distributed system
B. D. Fleisch 

August 1987 ACM SIGCOMM Computer Communication Review , Proceedings of the 
ACM workshop on Frontiers in computer communications technology 
SIGCOMM '87, Volume 17 Issue 5 
Publisher: ACM Press 

Full text available: pdf(1.32 MB) Additional Information: full citation, abstract, references, citings, index terms



This work outlines the development and performance validation of an architecture for 
distributed shared memory in a loosely coupled distributed computing environment. This 
distributed shared memory may be used for communication and data exchange between 
communicants on different computing sites; the mechanism will operate transparently 
and in a distributed manner. This paper describes the architecture of this mechanism and 
metrics which will be used to measure its performan ... 

2 A taxonomy-based comparison of several distributed shared memory systems
Ming-Chit Tam, Jonathan M. Smith, David J. Farber
July 1990 ACM SIGOPS Operating Systems Review, volume 24 issue 3 
Publisher: ACM Press 

Full text available: pdf(1.96 MB) Additional Information: full citation, abstract, citings, index terms

Two possible modes of Input/Output (I/O) are "sequential" and "random-access", and
there is an extremely strong conceptual link between I/O and communication. Sequential
communication, typified in the I/O setting by magnetic tape, is typified in the
communication setting by a stream, e.g., a UNIX pipe. Random-access communication,
typified in the I/O setting by a drum or disk device, is typified in the communication 
setting by shared memory. In this paper, we study and s ... 



3 Techniques for reducing consistency-related communication in distributed shared-memory systems

John B. Carter, John K. Bennett, Willy Zwaenepoel 

August 1995 ACM Transactions on Computer Systems (TOCS), volume 13 issue 3 
Publisher: ACM Press 

Full text available: pdf(2.86 MB) Additional Information: full citation, abstract, references, citings, index terms, review

Distributed shared memory (DSM) is an abstraction of shared memory on a
distributed-memory machine. Hardware DSM systems support this abstraction at the architecture
level; software DSM systems support the abstraction within the runtime system. One of 
the key problems in building an efficient software DSM system is to reduce the amount of 
communication needed to keep the distributed memories consistent. In this article we 
present four techniques for doing so: software release consistency; m ... 

Keywords: cache consistency protocols, distributed shared memory, memory models, 
release consistency, virtual shared memory 



4 A compiler-directed distributed shared memory system
Tzi-cker Chiueh, Manish Verma

July 1995 Proceedings of the 9th international conference on Supercomputing 

Publisher: ACM Press 

Full text available: pdf(1.22 MB) Additional Information: full citation, references, citings, index terms



5 Source-level global optimizations for fine-grain distributed shared memory systems 
R. Veldema, R. F. H. Hofman, R. A. F. Bhoedjang, C. J. H. Jacobs, H. E. Bal 
June 2001 ACM SIGPLAN Notices, Proceedings of the eighth ACM SIGPLAN
symposium on Principles and practices of parallel programming PPoPP '01, Volume 36 Issue 7
Publisher: ACM Press 

Full text available: pdf(112.60 KB) Additional Information: full citation, abstract, references, citings, index terms

This paper describes and evaluates the use of aggressive static analysis in Jackal, a fine- 
grain Distributed Shared Memory (DSM) system for Java. Jackal uses an optimizing, 
source-level compiler rather than the binary rewriting techniques employed by most other 
fine-grain DSM systems. Source-level analysis makes existing access-check optimizations 
(e.g., access-check batching) more effective and enables two novel fine-grain DSM 
optimizations: object-graph aggregatio ... 

6 Preliminary thoughts on problem-oriented shared memory: a decentralized approach
to distributed systems
David R. Cheriton

October 1985 ACM SIGOPS Operating Systems Review, volume 19 issue 4 
Publisher: ACM Press 

Full text available: pdf(1.05 MB) Additional Information: full citation, abstract, references, citings

Much of the work to date on distributed systems has focused on the correct choice of 
communication paradigm, stressing (for example) message primitives, remote procedure 
call, problem-oriented protocols and so on. A distributed system service is then
implemented as a module executing on particular server machine that is accessed using 
these communication facilities. In contrast, the shared memory paradigm has been used 
on multiprocessor and uniprocessor systems. In the shared memo ...

7 Accurate data redistribution cost estimation in software distributed shared memory
systems

Donald G. Morris, David K. Lowenthal 

June 2001 ACM SIGPLAN Notices, Proceedings of the eighth ACM SIGPLAN
symposium on Principles and practices of parallel programming PPoPP '01, Volume 36 Issue 7
Publisher: ACM Press

Distributing data is one of the key problems in implementing efficient distributed-memory 
parallel programs. The problem becomes more difficult in programs where data 
redistribution between computational phases is considered. The global data distribution 
problem is to find the optimal distribution in multi-phase parallel programs. Solving this 
problem requires accurate knowledge of data redistribution cost. We are investigating this
problem in the context of a sof ... 



8 An integrated compile-time/run-time software distributed shared memory system
Sandhya Dwarkadas, Alan L. Cox, Willy Zwaenepoel 

September 1996 ACM SIGPLAN Notices, ACM SIGOPS Operating Systems Review,
Proceedings of the seventh international conference on Architectural
support for programming languages and operating systems ASPLOS-VII, Volume 31, 30 Issue 9, 5

Publisher: ACM Press 

Full text available: pdf(1.30 MB) Additional Information: full citation, abstract, references, citings, index terms

On a distributed memory machine, hand-coded message passing leads to the most 
efficient execution, but it is difficult to use. Parallelizing compilers can approach the 
performance of hand-coded message passing by translating data-parallel programs into 
message passing programs, but efficient execution is limited to those programs for which 
precise analysis can be carried out. Shared memory is easier to program than message 
passing and its domain is not constrained by the limitations of paralleli ... 

9 BFXM: a parallel file system model based on the mechanism of distributed shared
memory
Qun Li, Jie Jing, Li Xie

October 1997 ACM SIGOPS Operating Systems Review, volume 31 issue 4 
Publisher: ACM Press 

Full text available: pdf(768.69 KB) Additional Information: full citation, abstract, index terms

This paper proposes a parallel file system model under NOWs (network of workstations) 
environment. According to the features of NOWs, the system incorporates the mechanism 
of distributed shared memory, particularly the mechanism of COMA (cache only memory 
access). It links the memory of all nodes into a large cache; each node aggressively uses 
not only the local memory but also the remote memory of other nodes, which expedites 
the data accesses dramatically. It also accesses disks in parallel to ... 

Keywords: cache only memory access, distributed shared memory, parallel file system 



10 Workload decomposition for particle simulation applications on hierarchical
distributed-shared memory parallel systems with integration of HPF and OpenMP
Sergio Briguglio, Beniamino Di Martino, Gregorio Vlad

June 2001 Proceedings of the 15th international conference on Supercomputing 

Publisher: ACM Press 

Full text available: pdf(194.90 KB) Additional Information: full citation, abstract, references, index terms

A crucial issue in programming hierarchical distributed-shared memory systems is the 
workload decomposition. In this paper we address this issue in the framework of porting 
typical particle in cell (PIC) applications on hierarchical distributed-shared memory 
parallel systems. The workload decomposition we have devised consists in a two-stage 
procedure: a higher-level decomposition among the computational nodes, and a lower- 
level one among the processors of each computational nod ...

11 Distributed shared memory systems with improved barrier synchronization and data
transfer

Nian-Feng Tzeng, Angkul Kongmunvattana 

July 1997 Proceedings of the 11th international conference on Supercomputing 
Publisher: ACM Press 

Full text available: pdf(1.50 MB) Additional Information: full citation, references, citings, index terms



12 An asynchronous protocol for release consistent distributed shared memory systems 
Jaeheung Yeo, Heon Y. Yeom, Taesoon Park 

March 2000 Proceedings of the 2000 ACM symposium on Applied computing - Volume 2

Publisher: ACM Press 

Full text available: pdf(889.41 KB) Additional Information: full citation, references, index terms



Keywords: DSM, asynchronous, release consistency 



13 PLUS: a distributed shared-memory system
Roberto Bisiani, Mosur Ravishankar 

May 1990 ACM SIGARCH Computer Architecture News, Proceedings of the 17th
annual international symposium on Computer Architecture ISCA '90, volume 18 Issue 3a

Publisher: ACM Press 

Full text available: pdf(1.33 MB) Additional Information: full citation, abstract, references, citings, index terms

PLUS is a multiprocessor architecture tailored to the fast execution of a single 
multithreaded process; its goal is to accelerate the execution of CPU-bound applications. 
PLUS supports shared memory and efficient synchronization. Memory access latency is 
reduced by non-demand replication of pages with hardware-supported coherence between 
replicated pages. The architecture has been simulated in detail and the paper presents 
some of the key measurements that have been used to substantiate our ... 

14 How to share memory in a distributed system
Eli Upfal, Avi Wigderson 

January 1987 Journal of the ACM (JACM), volume 34 issue 1
Publisher: ACM Press 

Full text available: pdf(960.43 KB) Additional Information: full citation, abstract, references, citings, index terms, review

The power of shared-memory in models of parallel computation is studied, and a novel 
distributed data structure that eliminates the need for shared memory without 
significantly increasing the run time of the parallel computation is described. More 
specifically, it is shown how a complete network of processors can deterministically 
simulate one PRAM step in O(log n (log log n)^2) time when both models use n

15 A comprehensive bibliography of distributed shared memory
M. Rasit Eskicioglu 

January 1996 ACM SIGOPS Operating Systems Review, volume 30 issue 1
Publisher: ACM Press 

Full text available: pdf(2.08 MB) Additional Information: full citation, index terms

16 Munin: distributed shared memory based on type-specific memory coherence
J. K. Bennett, J. B. Carter, W. Zwaenepoel 

February 1990 ACM SIGPLAN Notices, Proceedings of the second ACM SIGPLAN
symposium on Principles & practice of parallel programming PPOPP '90, Volume 25 Issue 3
Publisher: ACM Press 

Full text available: pdf(1.05 MB) Additional Information: full citation, abstract, references, citings, index terms

We are developing Munin, a system that allows programs written for shared memory 
multiprocessors to be executed efficiently on distributed memory machines. Munin 
attempts to overcome the architectural limitations of shared memory machines, while 
maintaining their advantages in terms of ease of programming. Our system is unique in 
its use of loosely coherent memory, based on the partial order specified by a shared 
memory parallel program, and in its use of type-specific memory coherence. Ins ... 

17 Mirage: a coherent distributed shared memory design
B. Fleisch, G. Popek 

November 1989 ACM SIGOPS Operating Systems Review, Proceedings of the twelfth
ACM symposium on Operating systems principles SOSP '89, volume 23 Issue 5

Publisher: ACM Press 

Full text available: pdf(1.63 MB) Additional Information: full citation, abstract, references, citings, index terms

Shared memory is an effective and efficient paradigm for interprocess communication. We 
are concerned with software that makes use of shared memory in a single site system 
and its extension to a multimachine environment. Here we describe the design of a 
distributed shared memory (DSM) system called Mirage developed at UCLA. Mirage 
provides a form of network transparency to make network boundaries invisible for shared 
memory and is upward compatible with an existing interfac ... 

18 CRL: high-performance all-software distributed shared memory
K. L. Johnson, M. F. Kaashoek, D. A. Wallach 

December 1995 ACM SIGOPS Operating Systems Review , Proceedings of the fifteenth 
ACM symposium on Operating systems principles SOSP '95, volume 29 
Issue 5 

Publisher: ACM Press 

Full text available: pdf(2.02 MB) Additional Information: full citation, references, citings, index terms



19 Scalable fault-tolerant distributed shared memory 
Florin Sultan, Liviu Iftode, Thu Nguyen 

November 2000 Proceedings of the 2000 ACM/IEEE conference on Supercomputing 
(CDROM) 

Publisher: IEEE Computer Society 

Full text available: pdf(247.40 KB), Publisher Site Additional Information: full citation, abstract, references, citings, index terms

This paper shows how a state-of-the-art software distributed shared-memory (DSM) 
protocol can be efficiently extended to tolerate single-node failures. In particular, we 
extend a home-based lazy release consistency (HLRC) DSM system with independent 
checkpointing and logging to volatile memory, targeting shared-memory computing on 
very large LAN-based clusters. In these environments, where global coordination may be 
expensive, independent checkpointing becomes critical to scalability. Howev ...

20 Experiences in integrating distributed shared memory with virtual memory
management
R. Ananthanarayanan, Sathis Menon, Ajay Mohindra, Umakishore Ramachandran
July 1992 ACM SIGOPS Operating Systems Review, volume 26 issue 3 
Publisher: ACM Press 

Full text available: pdf(1.56 MB) Additional Information: full citation, abstract, citings, index terms

While the duality between message-passing and shared memory for interprocess 
communication is well-known, the shared memory paradigm has drawn considerable 
attention in recent times even in distributed systems. Distributed Shared Memory (DSM) 
is the abstraction for supporting the notion of shared memory in a physically non-shared 
(distributed) architecture. It gives a uniform set of mechanisms for accessing local and 
remote memories. Further, by combining shared memory style synchronization with ...

1 Preliminary thoughts on problem-oriented shared memory: a decentralized approach
to distributed systems
David R. Cheriton

October 1985 ACM SIGOPS Operating Systems Review, volume 19 issue 4 

Publisher: ACM Press 

Full text available: pdf(1.05 MB) Additional Information: full citation, abstract, references, citings

Much of the work to date on distributed systems has focused on the correct choice of 
communication paradigm, stressing (for example) message primitives, remote procedure 
call, problem-oriented protocols and so on. A distributed system service is then
implemented as a module executing on particular server machine that is accessed using 
these communication facilities. In contrast, the shared memory paradigm has been used 
on multiprocessor and uniprocessor systems. In the shared memo ... 



2 Paradigms 2: An architecture for a wide area distributed system
Philip Homburg, Maarten van Steen, Andrew S. Tanenbaum 

September 1996 Proceedings of the 7th workshop on ACM SIGOPS European 
workshop: Systems support for worldwide applications 

Publisher: ACM Press 

Full text available: pdf(658.88 KB) Additional Information: full citation, abstract, references, citings

Distributed systems provide sharing of resources and information over a computer 
network. A key design issue that makes these systems attractive is that all aspects 
related to distribution are transparent to users. Unfortunately, general-purpose wide area 
distributed systems that allow users to share and manage arbitrary resources in a 
transparent way hardly exist. In particular, they generally do not take into account the 
most important properties that characterize wide area systems: 1) A very ... 



3 "Topologies" — distributed objects on multicomputers 
Karsten Schwan, Win Bo 



May 1990 ACM Transactions on Computer Systems (TOCS), volume 8 issue 2 
Publisher: ACM Press 

Full text available: pdf(3.83 MB) Additional Information: full citation, abstract, references, citings, index terms, review



Application programs written for large-scale multicomputers with interconnection 
structures known to the programmer (e.g., hypercubes or meshes) use complex
communication structures for connecting the applications' parallel tasks. Such structures
implement a wide variety of functions, including the exchange of data or control 
information relevant to the task computations and/or the communications required for 
task synchronization, message forwarding/filtering under program control, and so o ... 

4 Extending the operating system to support an object-oriented environment
J. A. Marques, P. Guedes 

September 1989 ACM SIGPLAN Notices, Conference proceedings on Object-oriented
programming systems, languages and applications OOPSLA '89, Volume 24 Issue 10

Publisher: ACM Press 

Full text available: pdf(1.21 MB) Additional Information: full citation, abstract, references, citings, index terms

Comandos is a project within the European Strategic Programme for Research on 
Information Technology - ESPRIT and it stems from the identified need of providing 
simpler and more integrated environments for application development in large 
distributed systems. The fundamental goal of the project is the definition of an integrated 
platform providing support for distributed and concurrent processing in a LAN 
environment, extensible and distributed data management an ... 

5 Implementation and performance of Munin
John B. Carter, John K. Bennett, Willy Zwaenepoel 

September 1991 ACM SIGOPS Operating Systems Review, Proceedings of the
thirteenth ACM symposium on Operating systems principles SOSP '91, Volume 25 Issue 5
Publisher: ACM Press 

Full text available: pdf(1.46 MB) Additional Information: full citation, abstract, references, citings, index terms

Munin is a distributed shared memory (DSM) system that allows shared memory parallel 
programs to be executed efficiently on distributed memory multiprocessors. Munin is 
unique among existing DSM systems in its use of multiple consistency protocols and in its 
use of release consistency. In Munin, shared program variables are annotated with their 
expected access pattern, and these annotations are then used by the runtime system to 
choose a consistency protocol best suited to that acc ... 

6 High-speed distributed data handling for on-line instrumentation systems 

William E. Johnston, William Greiman, Gary Hoo, Jason Lee, Brian Tierney, Craig Tull, 
Douglas Olson 

November 1997 Proceedings of the 1997 ACM/IEEE conference on Supercomputing 
(CDROM) 

Publisher: ACM Press 

Full text available: pdf(438.36 KB) Additional Information: full citation, abstract, references

The advent (and promise) of shared, widely available, high-speed networks provides the 
potential for new approaches to the collection, organization, storage, and analysis of high- 
speed and high-volume data streams from high data-rate, on-line instruments. We have 
worked in this area for several years, have identified and addressed a variety of problems 
associated with this scenario, and have evolved an architecture, implementations, and a 
monitoring methodology that have been successful in addre ...

7 Design of the Mneme persistent object store
J. Eliot B. Moss 

April 1990 ACM Transactions on Information Systems (TOIS), volume 8 issue 2 
Publisher: ACM Press 

Additional Information: full citation, abstract, references, citings, index terms

The Mneme project is an investigation of techniques for integrating programming
language and database features to provide better support for cooperative, information- 
intensive tasks such as computer-aided software engineering. The project strategy is to 
implement efficient, distributed, persistent programming languages. We report here on 
the Mneme persistent object store, a fundamental component of the project, discussing 
its design and initial prototype. Mneme stores objects 

8 CLAM - an open system for graphical user interfaces
Lisa A. Call, David L. Cohrs, Barton P. Miller 

December 1987 ACM SIGPLAN Notices, Conference proceedings on Object-oriented
programming systems, languages and applications OOPSLA '87, volume 22 Issue 12

Publisher: ACM Press 

Full text available: pdf(1.02 MB) Additional Information: full citation, abstract, references, citings, index terms

CLAM is an object-oriented system designed to support the building of extensible 
graphical user interfaces. CLAM provides a basic windowing environment with the ability 
to extend its functions using dynamically loaded C++ classes. The dynamically loaded 
classes allow for performance tuning (by transparently loading the class in either the 
client or the CLAM server) and for sharing of new functions. In addition to the traditionally 
layering of output abstractions, CLAM allows the ... 

9 Office-by-example: an integrated office system and database manager
Kyu-Young Whang, Art Ammann, Anthony Bolmarcich, Maria Hanrahan, Guy Hochgesang, 
Kuan-Tsae Huang, Al Khorasani, Ravi Krishnamurthy, Gary Sockut, Paula Sweeney, Vance 
Waddle, Moshe Zloof 

October 1987 ACM Transactions on Information Systems (TOIS), volume 5 issue 4 
Publisher: ACM Press 

Full text available: pdf(2.86 MB) Additional Information: full citation, abstract, references, citings, index terms, review

Office-by-Example (OBE) is an integrated office information system that has been under 
development at IBM Research. OBE, an extension of Query-by-Example, supports various 
office features such as database tables, word processing, electronic mail, graphics, 
images, and so forth. These seemingly heterogeneous features are integrated through a 
language feature called example elements. Applications involving example elements are 
processed by the database manager, an integrated ... 

10 Dynamic software testing of MPI applications with Umpire
Jeffrey S. Vetter, Bronis R. de Supinski 

November 2000 Proceedings of the 2000 ACM/IEEE conference on Supercomputing 
(CDROM) 

Publisher: IEEE Computer Society 

Full text available: pdf(83.83 KB), Publisher Site Additional Information: full citation, abstract, references, index terms

As evidenced by the popularity of MPI (Message Passing Interface), message passing is an 
effective programming technique for managing coarse-grained concurrency on distributed 
computers. Unfortunately, debugging message-passing applications can be difficult. 
Software complexity, data races, and scheduling dependencies can make programming 
errors challenging to locate with manual, interactive debugging techniques. This article 
describes Umpire, a new tool for detecting programming errors at ... 

11 Distributed shared memory in a loosely coupled distributed system
B. D. Fleisch 






Results (page 1): +distributed +shared +memory +system "object handle" 






August 1987 ACM SIGCOMM Computer Communication Review , Proceedings of the 
ACM workshop on Frontiers in computer communications technology 
SIGCOMM '87, Volume 17 Issue 5 
Publisher: ACM Press 

Full text available: pdf(1.32 MB) Additional Information: full citation, abstract, references, citings, index terms

This work outlines the development and performance validation of an architecture for 
distributed shared memory in a loosely coupled distributed computing environment. This 
distributed shared memory may be used for communication and data exchange between 
communicants on different computing sites; the mechanism will operate transparently 
and in a distributed manner. This paper describes the architecture of this mechanism and 
metrics which will be used to measure its performan ... 

12 A taxonomy-based comparison of several distributed shared memory systems 
Ming-Chit Tam, Jonathan M. Smith, David J. Farber
July 1990 ACM SIGOPS Operating Systems Review, volume 24 issue 3 
Publisher: ACM Press 

Full text available: pdf(1.96 MB) Additional Information: full citation, abstract, citings, index terms

Two possible modes of Input/Output (I/O) are "sequential" and "random-access", and
there is an extremely strong conceptual link between I/O and communication. Sequential
communication, typified in the I/O setting by magnetic tape, is typified in the
communication setting by a stream, e.g., a UNIX pipe. Random-access communication,
typified in the I/O setting by a drum or disk device, is typified in the communication 
setting by shared memory. In this paper, we study and s ... 

13 Techniques for reducing consistency-related communication in distributed shared-memory systems
John B. Carter, John K. Bennett, Willy Zwaenepoel

August 1995 ACM Transactions on Computer Systems (TOCS), volume 13 issue 3 
Publisher: ACM Press 

Full text available: pdf(2.86 MB) Additional Information: full citation, abstract, references, citings, index terms, review

Distributed shared memory (DSM) is an abstraction of shared memory on a distributed- 
memory machine. Hardware DSM systems support this abstraction at the architecture 
level; software DSM systems support the abstraction within the runtime system. One of 
the key problems in building an efficient software DSM system is to reduce the amount of 
communication needed to keep the distributed memories consistent. In this article we 
present four techniques for doing so: software release consistency; m ... 

Keywords: cache consistency protocols, distributed shared memory, memory models, 
release consistency, virtual shared memory 



14 A compiler-directed distributed shared memory system
Tzi-cker Chiueh, Manish Verma
July 1995 Proceedings of the 9th international conference on Supercomputing

Publisher: ACM Press 

Full text available: pdf(1.22 MB) Additional Information: full citation, references, citings, index terms



15 Source-level global optimizations for fine-grain distributed shared memory systems
R. Veldema, R. F. H. Hofman, R. A. F. Bhoedjang, C. J. H. Jacobs, H. E. Bal 
June 2001 ACM SIGPLAN Notices, Proceedings of the eighth ACM SIGPLAN
symposium on Principles and practices of parallel programming PPoPP '01, Volume 36 Issue 7
Publisher: ACM Press 

Full text available: pdf(112.60 KB) Additional Information: full citation, abstract, references, citings, index terms

This paper describes and evaluates the use of aggressive static analysis in Jackal, a fine- 
grain Distributed Shared Memory (DSM) system for Java. Jackal uses an optimizing, 
source-level compiler rather than the binary rewriting techniques employed by most other 
fine-grain DSM systems. Source-level analysis makes existing access-check optimizations
(e.g., access-check batching) more effective and enables two novel fine-grain DSM 
optimizations: object-graph aggregatio ... 

16 Accurate data redistribution cost estimation in software distributed shared memory
systems

Donald G. Morris, David K. Lowenthal 
June 2001 ACM SIGPLAN Notices, Proceedings of the eighth ACM SIGPLAN
symposium on Principles and practices of parallel programming PPoPP '01, Volume 36 Issue 7
Publisher: ACM Press 

Full text available: pdf(270.58 KB) Additional Information: full citation, abstract, references, citings, index terms

Distributing data is one of the key problems in implementing efficient distributed-memory 
parallel programs. The problem becomes more difficult in programs where data 
redistribution between computational phases is considered. The global data distribution 
problem is to find the optimal distribution in multi-phase parallel programs. Solving this 
problem requires accurate knowledge of data redistribution cost. We are investigating this 
problem in the context of a sof ...

On a distributed memory machine, hand-coded message passing leads to the most
efficient execution, but it is difficult to use. Parallelizing compilers can approach the
performance of hand-coded message passing by translating data-parallel programs into
message passing programs, but efficient execution is limited to those programs for which
precise analysis can be carried out. Shared memory is easier to program than message
passing and its domain is not constrained by the limitations of paralleli ...

This paper proposes a parallel file system model under NOWs (network of workstations)
environment. According to the features of NOWs, the system incorporates the mechanism
of distributed shared memory, particularly the mechanism of COMA (cache only memory
access). It links the memory of all nodes into a large cache; each node aggressively uses
not only the local memory but also the remote memory of other nodes, which expedites
the data accesses dramatically. It also accesses disks in parallel to ...

A crucial issue in programming hierarchical distributed-shared memory systems is the
workload decomposition. In this paper we address this issue in the framework of porting
typical particle in cell (PIC) applications on hierarchical distributed-shared memory
parallel systems. The workload decomposition we have devised consists in a two-stage
procedure: a higher-level decomposition among the computational nodes, and a lower-
level one among the processors of each computational nod ...